Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build against NPY2 #3894

Merged
merged 30 commits into from
Aug 8, 2024
Merged

build against NPY2 #3894

merged 30 commits into from
Aug 8, 2024

Conversation

njzjz
Copy link
Contributor

@njzjz njzjz commented Jun 25, 2024

Summary

Major changes:

  • fix 1: Build against NPY2 instead of NPY1. This will support both NPY1 and NPY2. See https://numpy.org/devdocs/dev/depending_on_numpy.html#numpy-2-0-specific-advice.
  • fix 2: Replace np.int_t (missing in NPY2) with np.int64_t and np.float_ with np.float64.
  • fix 3: Replace long with np.int64_t, as "The default integer type on Windows is now int64 rather than
    int32, matching the behavior on other platforms" per the change log of NPY2.

Todos

If this is work in progress, what else needs to be done?

  • feature 2: ...
  • fix 2:

Checklist

  • Google format doc strings added. Check with ruff.
  • Type annotations included. Check with mypy.
  • Tests added for new features/fixes.
  • If applicable, new classes/functions/modules have duecredit @due.dcite decorators to reference relevant papers by DOI (example)

Tip: Install pre-commit hooks to auto-check types and linting before every commit:

pip install -U pre-commit
pre-commit install

@njzjz njzjz marked this pull request as draft June 25, 2024 20:36
@njzjz
Copy link
Contributor Author

njzjz commented Jun 25, 2024

Unluckily, there are still several errors. Some are related to the upstream chgnet and scipy and cannot be fixed here.

@janosh
Copy link
Member

janosh commented Aug 5, 2024

@njzjz sorry for the lack of reply here! looks like this is almost finished. thanks for the effort! feel free to skip the failing chgnet tests for now. re scipy, we'll have to pin to <1.14 as the newest release dropped Python 3.9 support. but 1.13.0 already supports numpy v2

@janosh janosh added compatability Concerning pymatgen compatibility with different OS, Python versions, numpy versions, etc. pkg Package health and distribution related stuff labels Aug 5, 2024
@DanielYang59
Copy link
Contributor

DanielYang59 commented Aug 5, 2024

I really appreciate the pioneering work @njzjz! Meanwhile in case anyone missed this, ruff now has a dedicated rule NPY201 to handle API migration automatically :)

ruff check path/to/code/ --select NPY201

@njzjz njzjz marked this pull request as ready for review August 5, 2024 19:42
@njzjz
Copy link
Contributor Author

njzjz commented Aug 5, 2024

For cross reference, I've submitted the SciPy issue to scipy/scipy#21052

@njzjz
Copy link
Contributor Author

njzjz commented Aug 5, 2024

I do not know how to fix the rest of the errors and need help.

@janosh
Copy link
Member

janosh commented Aug 5, 2024

@njzjz feel free to skip the two failing chgnet tests. we'll migrate it to numpy v2 once pymatgen has. for test_delta_func, you can conditionally skip it like this:

@pytest.mark.skipif(sys.platform == "win32" and np.__version__ > "2.0.0", reason="Fails on Windows with numpy > 2.0.0, awaiting https://github.com/scipy/scipy/issues/21052 resolution")
def test_delta_func():
    x = np.array([0, 1, 2, 3, 4, 5])

not sure about TestWavecar.test_get_parchg. that might need a closer look. maybe @esoteric-ephemera has some ideas on what's going wrong?

@DanielYang59
Copy link
Contributor

DanielYang59 commented Aug 6, 2024

Just did some investigation on the failure of TestWavecar.test_get_parchg, unfortunately I wasn't able to come to a definite conclusion, but still share my findings in case it's helpful.

I believe this failure has something to do with the float point precision (though I don't know exactly how), with the following script on Windows Python 3.12:

import numpy as np

from pymatgen.io.vasp import Poscar, Wavecar


TEST_DIR = "tests/files/io/vasp"
poscar = Poscar.from_file(f"{TEST_DIR}/inputs/POSCAR")
w = Wavecar(f"{TEST_DIR}/outputs/WAVECAR.H2.ncl")

w.coeffs.append([np.ones((2, 100))])
c = w.get_parchg(poscar, -1, 0, phase=False, spinor=None)

print(np.min(c.data["total"]))  # 0.0 in NumPy 1.x, 8.738907122657083e-33 in NumPy 2.x
assert not np.all(c.data["total"] > 0.0)  # AssertionError

Note the min value in c.data["total"] is very close to zero (the min positive value with float32 is around 10-38).

From the "NumPy 2.0 migration guide", there seems to be two changes related to this:

  1. The default Windows integer was int32 for NumPy 1.x, and is now int64 (confirmed on my Windows PC), for Linux/Mac it's always int64
  2. The change in the scalar promotion rules

However, with these changes, I would expect the precision on Windows to be consistent with Linux on NumPy 2.x. Perhaps someone more familiar with the internal workings of Wavecar would know the reason :)

Copy link
Member

@janosh janosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks a lot @njzjz for this work and esp. for reporting multiple upstream issues! 🥇 and thanks @DanielYang59 for supporting! good team effort to get the many failing tests here sorted out.

@DanielYang59
Copy link
Contributor

The failing tests for tests/io/abinit/test_abiobjects.py looks like another randomly failing test. I cannot recreate it on my Windows PC.

@@ -50,7 +50,8 @@ class LinearAssignment:
"""

def __init__(self, costs, epsilon=1e-13):
self.orig_c = np.array(costs, dtype=np.float_, copy=False, order="C")
# https://numpy.org/devdocs/numpy_2_0_migration_guide.html#adapting-to-changes-in-the-copy-keyword
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janosh This doesn't look like something we should keep inside the code (it's pretty easy to find)? What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm generally in favor of linking relevant docs. even if easy to find, people might not think to look for them. but no strong opinion in this case

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one looks more like a "disposable" migration guide which we would only do once :)

@janosh
Copy link
Member

janosh commented Aug 6, 2024

The failing tests for tests/io/abinit/test_abiobjects.py looks like another randomly failing test. I cannot recreate it on my Windows PC.

probably something to report to https://github.com/Unidata/netcdf4-python, even if just to say the error message is self-contradictory...

E OSError: Error loading charset-233f44715e5aacaf3a688c2faff5ddf7.dll; The operation completed successfully.

@DanielYang59
Copy link
Contributor

The failing tests for tests/io/abinit/test_abiobjects.py looks like another randomly failing test. I cannot recreate it on my Windows PC.

probably something to report to https://github.com/Unidata/netcdf4-python, even if just to say the error message is self-contradictory...

Agreed, just for my own record, it failed in another PR randomly https://github.com/materialsproject/pymatgen/actions/runs/10268736299/job/28412461926?pr=3892

@njzjz
Copy link
Contributor Author

njzjz commented Aug 6, 2024

E OSError: Error loading charset-233f44715e5aacaf3a688c2faff5ddf7.dll; The operation completed successfully.

This part of the code is generated by delvewheel here. I think I can report there.

According to the Windows documentation, this error message means no error. So it may be incorrect to throw an error (while it is also strange that LoadLibraryExW returns nothing but gives ERROR_SUCCESS).

ERROR_SUCCESS
0 (0x0)
The operation completed successfully.

See also https://stackoverflow.com/questions/7524142/what-does-windows-error-0-error-success-mean

@njzjz
Copy link
Contributor Author

njzjz commented Aug 6, 2024

E OSError: Error loading charset-233f44715e5aacaf3a688c2faff5ddf7.dll; The operation completed successfully.

Submitted to adang1345/delvewheel#51

@janosh
Copy link
Member

janosh commented Aug 8, 2024

@njzjz can you pin pyproject to the fixed delvewheel>=1.7.4

janosh and others added 2 commits August 8, 2024 09:16
auto-merge was automatically disabled August 8, 2024 18:05

Head branch was pushed to by a user without write access

@njzjz
Copy link
Contributor Author

njzjz commented Aug 8, 2024

@njzjz can you pin pyproject to the fixed delvewheel>=1.7.4

This may not work, as delvewheel is a build-time dependency, not a runtime dependency.

@janosh
Copy link
Member

janosh commented Aug 8, 2024

This may not work, as delvewheel is a build-time dependency, not a runtime dependency.

ah, didn't realize that. so the place to pin that would be here

[build-system]
requires = [
"Cython>=0.29.23",
# Building against NPY2 will support both NPY1 and NPY2

but looks like we're in the green anyway so would be good to remove the temp install command in test.yml

@janosh
Copy link
Member

janosh commented Aug 8, 2024

thanks again @njzjz! 👍

@janosh janosh merged commit ce360f4 into materialsproject:master Aug 8, 2024
33 checks passed
@DanielYang59
Copy link
Contributor

Just recorded another failure in https://github.com/materialsproject/pymatgen/actions/runs/10319015504/job/28566565139:

OSError: Error loading libxml2-14375db763c77d38e40bbe5c028add84.dll; The operation completed successfully.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compatability Concerning pymatgen compatibility with different OS, Python versions, numpy versions, etc. pkg Package health and distribution related stuff
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants